The Multistage Forecast Model
This is a tutorial for the Multistage Forecast model. Multistage Forecast is a fast solution designed for more granular time series (for example, minute-level), where a long history is needed to train a good model.
For example, suppose we want to train a model on 2 years of 5-minute frequency data. That’s 210,240 observations. If we directly fit a model to large input data, training time and resource demand can be high (15+ minutes on i9 CPU). If we use a shorter period to train the model, the model will not be able to capture long term effects such as holidays, monthly/quarterly seasonalities, year-end drops, etc. There is a trade-off between speed and accuracy.
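The observation count quoted above can be checked with a few lines of arithmetic. This is only a sanity check; the helper name `n_observations` is made up for this illustration and is not part of Greykite.

```python
# Back-of-the-envelope count of observations for a given history and frequency.
MINUTES_PER_DAY = 24 * 60


def n_observations(days: int, freq_minutes: int) -> int:
    """Number of observations in `days` days sampled every `freq_minutes` minutes."""
    return days * MINUTES_PER_DAY // freq_minutes


five_minute = n_observations(days=2 * 365, freq_minutes=5)
daily = n_observations(days=2 * 365, freq_minutes=MINUTES_PER_DAY)
print(five_minute)  # 210240
print(daily)        # 730
```

Aggregating the same two years to daily data leaves only 730 rows, which is why the slow-moving components are cheap to learn at that granularity.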
On the other hand, suppose that, due to a data retention policy, we only have a short history at the original frequency but a longer history of aggregated data. Could we use both datasets to make the prediction more accurate?
Multistage Forecast is designed to close this gap. It’s easy to observe the following facts:
Trend can be learned with data at a weekly/daily granularity.
Yearly seasonality, weekly seasonality and holiday effects can be learned with daily data.
Daily seasonality and autoregression effects can be learned with most recent data if the forecast horizon is small (which is usually the case in minute-level data).
This leads to a natural idea: not every component of the forecast model needs to be learned from minute-level data. Training each component on the coarsest granularity that suffices can save a great deal of time while keeping the desired accuracy.
Here we introduce the Multistage Forecast algorithm, which is built upon the idea above:
Multistage Forecast trains multiple models to fit a time series.
Each stage of the model trains on the residuals of the previous stages, takes an appropriate length of data, does an optional aggregation, and learns the appropriate components for the granularity.
The final predictions will be the sum of the predictions from all stages of models.
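The steps above can be illustrated with a toy two-stage fit. This is a minimal sketch under stated assumptions, not Greykite's implementation: the synthetic series, the linear trend on daily means used as the stage 1 "model", and the hour-of-day average used as the stage 2 "model" are all stand-ins chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hourly series: slow trend + daily cycle + noise (illustrative data only).
n_days, hours_per_day = 60, 24
t = np.arange(n_days * hours_per_day)
hour = t % hours_per_day
y = (
    100
    + 0.01 * t
    + 10 * np.sin(2 * np.pi * hour / hours_per_day)
    + rng.normal(0, 1, t.size)
)

# Stage 1: aggregate to daily means and fit a linear trend on the daily data.
day_index = np.arange(n_days)
daily_mean = y.reshape(n_days, hours_per_day).mean(axis=1)
slope, intercept = np.polyfit(day_index, daily_mean, deg=1)
stage1_fit = np.repeat(intercept + slope * day_index, hours_per_day)  # back to hourly

# Stage 2: fit the residuals of stage 1 using only the most recent 14 days,
# here with a simple hour-of-day average standing in for daily seasonality.
residual = y - stage1_fit
recent = residual[-14 * hours_per_day:].reshape(14, hours_per_day)
hour_effect = recent.mean(axis=0)
stage2_fit = hour_effect[hour]

# The final fitted values are the sum of the fitted values from all stages.
final_fit = stage1_fit + stage2_fit

rmse_stage1 = np.sqrt(np.mean((y - stage1_fit) ** 2))
rmse_final = np.sqrt(np.mean((y - final_fit) ** 2))
print(f"RMSE after stage 1: {rmse_stage1:.2f}")
print(f"RMSE after stage 2: {rmse_final:.2f}")
```

Stage 1 only ever sees 60 daily points, and stage 2 only sees 14 days of hourly residuals, yet the summed fit accounts for both the trend and the daily cycle.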
In practice, we’ve found Multistage Forecast to reduce training time by up to 10X while maintaining accuracy, compared to a Silverkite model trained on the full dataset.
A diagram of the Multistage Forecast model flow is shown below.
Next, we will see examples of how to configure Multistage Forecast models.
# import libraries
import plotly
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.templates.autogen.forecast_config import (
    ForecastConfig,
    MetadataParam,
    ModelComponentsParam,
    EvaluationPeriodParam,
)
from greykite.framework.templates.model_templates import ModelTemplateEnum
from greykite.framework.benchmark.data_loader_ts import DataLoaderTS
from greykite.algo.forecast.silverkite.forecast_simple_silverkite_helper import cols_interact
from greykite.framework.templates.multistage_forecast_template_config import MultistageForecastTemplateConfig
Configuring the Multistage Forecast model
We take an hourly dataset as an example. We will use the hourly Washington D.C. bikesharing dataset (source).
# loads the dataset
ts = DataLoaderTS().load_bikesharing_ts()
print(ts.df.head())

# plot the data
plotly.io.show(ts.plot())
Out:
                                     ts        date   y  tmin  tmax   pn
2010-09-20 12:00:00 2010-09-20 12:00:00  2010-09-20  11  12.8  25.6  0.0
2010-09-20 13:00:00 2010-09-20 13:00:00  2010-09-20  13  12.8  25.6  0.0
2010-09-20 14:00:00 2010-09-20 14:00:00  2010-09-20   8  12.8  25.6  0.0
2010-09-20 15:00:00 2010-09-20 15:00:00  2010-09-20   8  12.8  25.6  0.0
2010-09-20 16:00:00 2010-09-20 16:00:00  2010-09-20  12  12.8  25.6  0.0
The data contains a few years of hourly data. Directly training on the entire dataset may take a couple of minutes. Now let’s consider a two-stage model with the following configuration:
Daily model: a model trained on 2 years of data with daily aggregation. The model will learn the trend, yearly seasonality, weekly seasonality and holidays. For an explanation of the configuration below, see the paper.
Hourly model: a model trained on the residuals to learn short-term patterns. The model will learn daily seasonality, its interaction with the is_weekend indicator, and some autoregression effects.
From Tune your first forecast model we know how to specify each single model above. The core configuration is specified via ModelComponentsParam. We can specify the two models as follows.
# the daily model
daily_model_components = ModelComponentsParam(
    growth=dict(
        growth_term="linear"
    ),
    seasonality=dict(
        yearly_seasonality=12,
        quarterly_seasonality=0,
        monthly_seasonality=0,
        weekly_seasonality=5,
        daily_seasonality=0  # daily model does not have daily seasonality
    ),
    changepoints=dict(
        changepoints_dict=dict(
            method="auto",
            regularization_strength=0.5,
            yearly_seasonality_order=12,
            resample_freq="3D",
            potential_changepoint_distance="30D",
            no_changepoint_distance_from_end="30D"
        ),
        seasonality_changepoints_dict=None
    ),
    autoregression=dict(
        autoreg_dict="auto"
    ),
    events=dict(
        holidays_to_model_separately=["Christmas Day", "New Year's Day", "Independence Day", "Thanksgiving"],
        holiday_lookup_countries=["UnitedStates"],
        holiday_pre_num_days=1,
        holiday_post_num_days=1
    ),
    custom=dict(
        fit_algorithm_dict=dict(
            fit_algorithm="ridge"
        ),
        feature_sets_enabled="auto",
        min_admissible_value=0
    )
)

# creates daily seasonality interaction with is_weekend
daily_interaction = cols_interact(
    static_col="is_weekend",
    fs_name="tod_daily",
    fs_order=5
)

# the hourly model
hourly_model_components = ModelComponentsParam(
    growth=dict(
        growth_term=None  # growth is already modeled in daily model
    ),
    seasonality=dict(
        yearly_seasonality=0,
        quarterly_seasonality=0,
        monthly_seasonality=0,
        weekly_seasonality=0,
        daily_seasonality=12  # hourly model has daily seasonality
    ),
    changepoints=dict(
        changepoints_dict=None,
        seasonality_changepoints_dict=None
    ),
    events=dict(
        holidays_to_model_separately=None,
        holiday_lookup_countries=[],
        holiday_pre_num_days=0,
        holiday_post_num_days=0
    ),
    autoregression=dict(
        autoreg_dict="auto"
    ),
    custom=dict(
        fit_algorithm_dict=dict(
            fit_algorithm="ridge"
        ),
        feature_sets_enabled="auto",
        extra_pred_cols=daily_interaction
    )
)
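Conceptually, the interaction created by cols_interact above lets the daily pattern take a different shape on weekends than on weekdays. The sketch below shows that idea with plain NumPy. It is an illustration only and does not reproduce how Greykite builds its features internally: the helper `fourier_terms`, the two-day toy timeline, and the scaling of time of day to [0, 1) are all assumptions made for this example.

```python
import numpy as np


def fourier_terms(time_of_day, order):
    """Sine/cosine pairs for a cycle with period 1, up to `order` harmonics."""
    k = np.arange(1, order + 1)
    angles = 2 * np.pi * np.outer(time_of_day, k)
    return np.hstack([np.sin(angles), np.cos(angles)])


# Two days of hourly timestamps: a Friday followed by a Saturday (illustrative).
hours = np.arange(48)
time_of_day = (hours % 24) / 24.0          # fraction of the day in [0, 1)
is_weekend = (hours >= 24).astype(float)   # 0 for Friday, 1 for Saturday

base = fourier_terms(time_of_day, order=5)  # shared daily seasonality, 10 columns
interaction = base * is_weekend[:, None]    # extra daily shape, active on weekends only

print(base.shape, interaction.shape)
```

A linear model that includes both `base` and `interaction` columns can fit one daily curve for weekdays and adjust it on weekends, since the interaction columns are zero on weekdays.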
To use Multistage Forecast, we need to specify its model components, just as we do for the Simple Silverkite model. The Multistage Forecast configuration is specified via ModelComponentsParam.custom["multistage_forecast_configs"], which takes a list of MultistageForecastTemplateConfig objects, each of which represents a stage of the model.
The MultistageForecastTemplateConfig object for a single stage takes the following parameters:
train_length: the length of training data, for example "365D". Looks back from the end of the training data and takes observations up to this limit.
fit_length: the length of data where fitted values are calculated. Even if the training data is not the entire period, the fitted values can still be calculated on the entire period. The default will be the same as the training length.
agg_freq: the aggregation frequency in string representation. For example, "D", "H", etc. If not specified, the original frequency will be kept.
agg_func: the aggregation function name, default is "nanmean".
model_template: the model template name. This together with the model_components below specify the full model, just as when using the Simple Silverkite model.
model_components: the model components. This together with the model_template above specify the full model for a stage, just as when using the Simple Silverkite model.
MultistageForecastTemplateConfig represents the flow of each stage of the model:
taking the time series / residual,
taking the appropriate length of training data, doing an optional aggregation,
then training the model with the given parameters.
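The data preparation part of this flow (take the most recent window, then optionally aggregate) can be sketched with pandas. This is a hedged illustration rather than Greykite's code: `prepare_stage_data` is a hypothetical helper written for this example, and it uses pandas' `mean`, which skips missing values by default, as a stand-in for the "nanmean" aggregation.

```python
import numpy as np
import pandas as pd


def prepare_stage_data(series, train_length, agg_freq=None):
    """Hypothetical helper (not a Greykite function): keep the last
    `train_length` of data, then optionally aggregate to `agg_freq`."""
    cutoff = series.index.max() - pd.Timedelta(train_length)
    recent = series[series.index > cutoff]
    if agg_freq is None:
        return recent  # no aggregation: keep the original frequency
    # mean() ignores NaN values by default, similar in spirit to nanmean
    return recent.resample(agg_freq).mean()


# Illustrative hourly series covering 100 full days, with one missing value.
index = pd.date_range("2020-01-01", periods=100 * 24, freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.arange(len(index), dtype=float), index=index)
hourly.iloc[-1] = np.nan

daily_input = prepare_stage_data(hourly, train_length="60D", agg_freq="D")
hourly_input = prepare_stage_data(hourly, train_length="30D")
print(len(daily_input), len(hourly_input))  # 60 720
```

The long-window stage ends up with far fewer rows than the short-window stage, even though it covers twice the history.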
Now let’s define the MultistageForecastTemplateConfig objects one by one.
# the daily model
daily_config = MultistageForecastTemplateConfig(
    train_length="730D",  # use 2 years of data to train
    fit_length=None,  # fit on the same period as training
    agg_func="nanmean",  # aggregation function is nanmean
    agg_freq="D",  # aggregation frequency is daily
    model_template=ModelTemplateEnum.SILVERKITE.name,  # the model template
    model_components=daily_model_components  # the daily model components specified above
)

# the hourly model
hourly_config = MultistageForecastTemplateConfig(
    train_length="30D",  # use 30 days of data to train
    fit_length=None,  # fit on the same period as training
    agg_func="nanmean",  # aggregation function is nanmean
    agg_freq=None,  # None means no aggregation
    model_template=ModelTemplateEnum.SILVERKITE.name,  # the model template
    model_components=hourly_model_components  # the hourly model components specified above
)
The configurations simply go to ModelComponentsParam.custom["multistage_forecast_configs"]
as a list. We can specify the model components for Multistage Forecast as below.
Note that all keys other than "custom" and "uncertainty" will be ignored.
model_components = ModelComponentsParam(
    custom=dict(
        multistage_forecast_configs=[daily_config, hourly_config]
    ),
    uncertainty=dict()
)
Now we can fill in the other parameters needed by ForecastConfig.
# metadata
metadata = MetadataParam(
    time_col="ts",
    value_col="y",
    freq="H"  # the frequency should match the original data frequency
)

# evaluation period
evaluation_period = EvaluationPeriodParam(
    cv_max_splits=0,  # turn off cv to speed things up
    test_horizon=0,  # turn off test to speed things up
)

# forecast config
config = ForecastConfig(
    model_template=ModelTemplateEnum.MULTISTAGE_EMPTY.name,
    forecast_horizon=24,  # forecast 1 day ahead
    coverage=0.95,  # prediction interval is supported
    metadata_param=metadata,
    model_components_param=model_components,
    evaluation_period_param=evaluation_period
)
forecaster = Forecaster()
forecast_result = forecaster.run_forecast_config(
    df=ts.df,
    config=config
)

print(forecast_result.forecast.df_test.head())

# plot the predictions
fig = forecast_result.forecast.plot()
# interactive plot, click to zoom in
plotly.io.show(fig)